Markov Decision Processes: Discounted Expected Reward or Average Expected Reward?

نویسندگان

چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Average-Reward Decentralized Markov Decision Processes

Formal analysis of decentralized decision making has become a thriving research area in recent years, producing a number of multi-agent extensions of Markov decision processes. While much of the work has focused on optimizing discounted cumulative reward, optimizing average reward is sometimes a more suitable criterion. We formalize a class of such problems and analyze its characteristics, show...

متن کامل

Continuous Time Markov Decision Processes with Expected Discounted Total Rewards

Abstract. This paper discusses continuous time Markov decision processes with criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued and the discount rate is a real number. Under necessary conditions that the model is well defined, the state space is partitioned into three subsets, on which the optimal value function ...

متن کامل

Total Expected Discounted Reward MDPs: Existence of Optimal Policies

This article describes the results on the existence of optimal and nearly optimal policies for Markov Decision Processes (MDPs) with total expected discounted rewards. The problem of optimization of total expected discounted rewards for MDPs is also known under the name of discounted dynamic programming.

متن کامل

Pseudometrics for State Aggregation in Average Reward Markov Decision Processes

We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are well adapted to the structure of the MDP, we show how these may be used for state aggregation. Upper bounds on the loss that may be caused by working on the aggregated instead of the original MDP are given and compared ...

متن کامل

Geometric convergence in average reward Markov decision processes

• A submitted manuscript is the author's version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of Mathematical Analysis and Applications

سال: 1993

ISSN: 0022-247X

DOI: 10.1006/jmaa.1993.1031